Extending, Trimming and Fusing WordNet for Technical Documents
Abstract
This paper describes a tool for automatically extending and trimming a multilingual WordNet database to support cross-lingual retrieval and multilingual ontology building in intranets and domain-specific document collections. Hierarchies built from automatically extracted terms and combined with the WordNet relations are trimmed with a disambiguation method based on the document salience of the words in the glosses. The disambiguation is tested in a cross-lingual retrieval task, where it yields a considerable improvement (7%-11%). The condensed hierarchies can serve as browsing interfaces to the documents, complementing retrieval.
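The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of choosing a sense by the document salience of its gloss words. The salience measure (relative document frequency), the mean-score rule, the function names `salience` and `pick_sense`, and the sample data are all assumptions made for illustration, not the paper's method.

```python
import re
from collections import Counter


def salience(documents):
    """Assumed salience measure: fraction of documents containing each word."""
    counts = Counter()
    for doc in documents:
        # Count each word once per document.
        counts.update(set(re.findall(r"[a-z]+", doc.lower())))
    total = len(documents)
    return {word: n / total for word, n in counts.items()}


def pick_sense(glosses, sal):
    """Return the sense id whose gloss words have the highest mean salience."""
    def score(gloss):
        words = re.findall(r"[a-z]+", gloss.lower())
        if not words:
            return 0.0
        return sum(sal.get(w, 0.0) for w in words) / len(words)

    return max(glosses, key=lambda sense: score(glosses[sense]))


# Hypothetical domain collection and candidate senses for the word "valve".
docs = [
    "the pump pressure valve",
    "valve pressure sensor reading",
    "replace the pump seal",
]
glosses = {
    "valve.mechanical": "device controlling pressure and flow in a pump",
    "valve.anatomy": "heart membrane controlling blood flow",
}
best = pick_sense(glosses, salience(docs))
print(best)
```

In this toy collection the mechanical gloss shares "pressure" and "pump" with the documents, so it is the sense that would be kept and the anatomical sense would be trimmed. A real system would likely filter stop words and weight terms more carefully.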
Similar resources
FAQFinder with sense tagging FAQFinder without sense tagging
Figure 4: Recall vs. rejection for FAQFinder with and without WordNet sense tagging. In FAQFinder, sense tagging and calculation of semantic similarity are much more computationally intensive than term vector processing. However, since FAQFinder matches single questions rather than entire documents, the computa...
HesNegar: A Persian Sentiment WordNet
Awareness of others' opinions plays a crucial role in decision making for everyone from ordinary customers to top-level executives of manufacturing companies and other organizations. Today, with the advent of Web 2.0 and the expansion of social networks, a vast number of texts expressing people's opinions have been created. However, exploring the enormous amount of documents, various opi...
Small Is Powerful! Towards a Refinedly Enriched Ontology by Careful Pruning and Trimming
In this paper, we study how to better merge a WordNet-like ontology with an online encyclopedia. We first eliminate noise with heuristic rules, and then adopt a domain-dependent strategy to trim the encyclopedia structure. Finally, we integrate entities from the trimmed structure into the original ontology, and construct a refinedly enriched ontology. The experimental results show that...
Extending a wordnet framework for simplicity and scalability
The WordNet knowledge model is currently implemented in multiple software frameworks that provide procedural access to language instances of it. Frameworks tend to focus on structural and design aspects of the model, thus describing low-level interfaces for linguistic knowledge retrieval. Typically the only high-level feature directly accessible is word lookup, while traversal of semantic relations...
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is mainly used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...